ESTIMATING PERCENTILES OF NONNORMAL ANTHROPOMETRIC POPULATIONS:

FINAL  REPORT 

H. F. Martz, Jr.


ABSTRACT 

The  most  commonly  used  method  for  estimating  percentiles  of  anthropometric  populations  is  based  on  the 
assumption  that  the  population  is  normally  distributed.  This  assumption  is  approximately  true  for  many  such 
variables, e.g., hip breadth. On the other hand, numerous nonnormally distributed anthropometric populations
are  known  to  exist,  e.g.,  grip  strength.  The  question  of  how  to  estimate  percentiles  of  nonnormal  populations 
is  addressed  here. 

A nonparametric  percentile  estimation  method,  based  on  the  use  of  a kernel-type  probability  density 
function estimator, is presented. A "nonparametric" method is defined as a method that does not make or
require  any  assumption  about  the  statistical  distribution  of  the  underlying  population.  Thus,  the  method  can 
be  applied  to  any  population  of  anthropometric  data,  regardless  of  the  normality  of  the  data.  The  method  is 
simple  to  use;  however,  a single  nonlinear  equation  must  be  numerically  solved  on  a computer  by  any  one  of 
numerous  well-documented  nonlinear  root  finding  methods. 

Two  examples  are  used  to  illustrate  the  method.  In  the  first  example,  selected  samples  of  size  50  of  hip 
breadth  data  are  randomly  drawn  from  a population  of  size  2420  observations  from  the  1967  anthropometric 
survey  of  U.S.  Air  Force  flying  personnel.  The  proposed  method  is  compared  to  the  standard  gaussian  method. 
Since  this  population  was  selected  as  normally  distributed,  the  standard  method  outperforms  the  proposed 
nonparametric  method.  In  the  case  of  grip-strength  data,  the  proposed  method  yields  more  accurate  estimates, 
in  a mean  squared  error  sense,  of  the  upper  percentiles  of  this  population.  For  anthropometric  distributions 
known to be nonnormal or where normality cannot be assumed, the proposed nonparametric method appears
to be worth considering.


INTRODUCTION 

It  is  common  practice  to  characterize  anthropometric 
design  limits  in  terms  of  suitable  percentiles  of  a population 
of  interest.  A percentile  gives  the  percentage  of  persons 
within  the  population  who  have  a body  dimension  of  a 
certain  size  or  smaller.  There  are  two  commonly  used 
methods  for  estimating  percentiles.  The  first  method  is  a 
simple, well-known counting procedure. The data are
arranged in ascending order of size and then grouped
into convenient class intervals. Finally, the number of
measurements below each upper class limit is counted,
divided by the total number of measurements, and
multiplied by 100 to determine the percentile rank. This method
may be used either for the entire population or for a
sample from the population. In the case of sample data,
the computed percentile is an estimate of the (true)
underlying population percentile. As a consequence, it is subject
to certain statistical errors.
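
The counting procedure just described can be sketched in a few lines of Python. This is only an illustration: the function name `percentile_ranks`, the class limits, and the ten measurements are hypothetical, and "below" is taken here to mean strictly less than the upper class limit, which is an assumption about the intended convention.

```python
from bisect import bisect_left


def percentile_ranks(measurements, upper_class_limits):
    """Percentile rank of each upper class limit by simple counting.

    Assumption: "below" means strictly less than the limit.
    """
    ordered = sorted(measurements)  # arrange in ascending order of size
    n = len(ordered)
    ranks = {}
    for limit in upper_class_limits:
        count_below = bisect_left(ordered, limit)  # number of values < limit
        ranks[limit] = 100.0 * count_below / n
    return ranks


# Hypothetical hip breadth measurements in mm (not survey data).
hip_breadths_mm = [331, 338, 345, 349, 352, 355, 360, 366, 374, 389]
ranks = percentile_ranks(hip_breadths_mm, [340, 360, 380, 400])
```

When applied to a sample rather than the whole population, the returned ranks are estimates and carry the sampling error mentioned above.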

The second commonly used method is to assume that
the anthropometric measurement of interest is normally
distributed. The mean and variance of this distribution are
then used in conjunction with stated "factors" to estimate
the desired percentiles. The method requires that either
the entire population of measurements is available or that
the sample size is sufficiently large. The required "factors"
are provided by Roebuck (1957), and a complete
description may be found in the book by Roebuck, Kroemer, and
Thomson (1975, pp. 132-144).
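
As a rough illustration of the gaussian method, the sketch below computes a percentile as the mean plus a multiple of the standard deviation. It assumes that the tabulated "factors" correspond to standard normal quantiles, which are obtained here from Python's `statistics.NormalDist` rather than from the cited tables; the function name is hypothetical. The mean of 352 mm and standard deviation of 19 mm are the hip breadth population values reported in the EXAMPLES section.

```python
from statistics import NormalDist


def gaussian_percentile(mean, std_dev, alpha):
    """Percentile under a normality assumption: mean + z_alpha * std_dev.

    Assumption: the multiplying factor is the standard normal quantile z_alpha.
    """
    z_alpha = NormalDist().inv_cdf(alpha)
    return mean + z_alpha * std_dev


# Hip breadth population mean and standard deviation (mm) from the text.
x95 = gaussian_percentile(352.0, 19.0, 0.95)  # about 383.3 mm
```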


Most human factors investigators are aware of the
existence of certain anthropometric populations which are
nonnormally distributed. An example of such a population
will be presented later. The question of how to estimate
percentiles of such populations is an important one.
The purpose of this paper is to present a method which can
be used to obtain either population percentiles or percentile
estimates for any anthropometric population. The
method is a nonparametric one, which means that it does
not assume specific knowledge of the statistical distribution,
e.g., the normal distribution, of the measurement of interest.
Thus, the method is particularly appropriate for use
in populations which are either not known to be normal or
known to be nonnormal.

METHOD 

Background 

Over the past two decades there have been some important
developments in the area of statistical theory
known as "nonparametric probability density function
estimation." Wegman (1972a) presents a thorough summary
survey of the historical developments in this area. In
short, the basic idea is to provide an estimator which can
be used to estimate the complete underlying probability
density function, based on a sample from the population,
without the necessity of first estimating certain "parameters"
of the population such as the mean, variance, etc. However,
such characteristics may be estimated once the estimated
probability density function has been obtained. Of
particular interest here are the percentiles of such an
estimated probability density function.

The particular probability density function estimator
considered here is attributed to Rosenblatt (1956) and
Parzen (1962). Suppose that we have a random sample
x_1, x_2, ..., x_n of size n from some population having
unknown and unspecified probability density function f(x).
Following Rosenblatt (1956) and Parzen (1962), the estimator
of f(x), f_n(x), may in general be represented as

\[ f_n(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right), \tag{1} \]

where K(·) is a suitably chosen function, referred to as the
"kernel," "smoothing," or "window" function, and h = h(n)
is a suitably chosen function of n in which it is required
that h → 0 and nh → ∞ as n → ∞. The kernel function
K must also satisfy certain conditions which are given in
Parzen (1962). Based on the work of Parzen (1962), Wegman
(1972a, b), and Bennett (1970), the particular K and h
given by

\[ K(x) = 0.5 \exp(-|x|), \tag{2} \]

and

[the expression for h is not legible in the source]

where s is the sample standard deviation of the measurements
x_1, ..., x_n, are selected for use here. Although other
functions could be considered, this choice is known to
produce good results [Bennett (1970)]. Thus, the probability
density function estimator to be used here is given by

\[ f_n(x) = \frac{0.5}{nh} \sum_{i=1}^{n} \exp\!\left(-\frac{|x - x_i|}{h}\right). \tag{3} \]

To better understand this estimator, figures 1 and 2 give a
plot of (3) based on a random sample of size 100 observations
from a symmetric (approximately gaussian) distribution
and an asymmetric right-skewed distribution, respectively.
Both f(x) and f_n(x) are shown for comparison, and
arbitrary scales were chosen for x. It is observed in both
cases that the estimates provide reasonably close
approximations to the true densities.

Figure 1. The Estimate f_n(x) of an Underlying Gaussian Density f(x).
Figure 2. The Estimate f_n(x) of an Underlying Right-Skewed Density f(x).


Development

Of interest here are the percentiles of f_n(x) given in (3),
since these are the desired estimates of the population
percentiles of f(x). Let x_α represent the 100(α)th percentile of
f_n(x) given in (3). That is, for a specified value of α, x_α
satisfies the equation given by

\[ \int_{-\infty}^{x_\alpha} f_n(x)\,dx = \alpha. \tag{4} \]

Thus, x_α is the required percentile estimate of the population
percentile value for which 100(α)% of the anthropometric
measurements do not exceed this value. For example,
if α = 0.95, then x_0.95 is the required estimate of the 95th
population percentile.
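
To make the density estimator concrete, the following sketch evaluates the kernel estimate obtained by using the double-exponential kernel K(x) = 0.5 exp(-|x|) in the general Rosenblatt-Parzen form, that is, f_n(x) = (0.5/(nh)) Σ exp(-|x - x_i|/h). The function name, the five measurements, and the value of h are hypothetical; in particular, h is passed in as a plain argument and is not computed from the bandwidth rule used in the paper.

```python
import math


def laplace_kde(x, sample, h):
    """Kernel density estimate with kernel K(u) = 0.5 * exp(-|u|).

    h is the smoothing parameter; its value is supplied by the caller
    (an assumption of this sketch, not the paper's bandwidth rule).
    """
    n = len(sample)
    return 0.5 / (n * h) * sum(math.exp(-abs(x - xi) / h) for xi in sample)


# Hypothetical measurements (mm) and an arbitrary smoothing parameter.
sample = [340.0, 352.0, 355.0, 361.0, 378.0]
h = 8.0
density_at_355 = laplace_kde(355.0, sample, h)
```

Because each kernel term integrates to one, the estimate is itself a proper density, so its percentiles are well defined.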

Substituting (3) into (4), integrating, and simplifying
gives x_α as the solution to the nonlinear equation, written
here in terms of the integrated kernel G,

\[ \frac{1}{n} \sum_{i=1}^{n} G\!\left(\frac{x_\alpha - x_i}{h}\right) = \alpha, \qquad G(u) = \begin{cases} 0.5\,e^{u}, & u < 0, \\ 1 - 0.5\,e^{-u}, & u \ge 0. \end{cases} \tag{5} \]

Although this equation looks formidable, it may be easily
and efficiently solved for x_α on a computer by means of
any one of numerous well-documented nonlinear root
finding subroutines.
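
A minimal sketch of such a solution is given below, assuming the double-exponential kernel K(x) = 0.5 exp(-|x|). Integrating the kernel gives G(u) = 0.5 exp(u) for u < 0 and G(u) = 1 - 0.5 exp(-u) otherwise, so the estimated distribution function is the average of G((x - x_i)/h) over the sample. Plain bisection is used here purely for illustration; the paper does not name a specific root finder, and the function names and the smoothing parameter h are assumptions.

```python
import math


def laplace_kde_cdf(x, sample, h):
    """Integral of the double-exponential kernel estimate from -infinity to x."""
    total = 0.0
    for xi in sample:
        u = (x - xi) / h
        total += 0.5 * math.exp(u) if u < 0 else 1.0 - 0.5 * math.exp(-u)
    return total / len(sample)


def kde_percentile(alpha, sample, h, tol=1e-9):
    """Solve laplace_kde_cdf(x) = alpha for x by bisection (0 < alpha < 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    lo, hi = min(sample), max(sample)
    step = h
    while laplace_kde_cdf(lo, sample, h) > alpha:  # widen bracket to the left
        lo -= step
        step *= 2.0
    step = h
    while laplace_kde_cdf(hi, sample, h) < alpha:  # widen bracket to the right
        hi += step
        step *= 2.0
    while hi - lo > tol:  # the distribution function is increasing in x
        mid = 0.5 * (lo + hi)
        if laplace_kde_cdf(mid, sample, h) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical measurements (mm) and an arbitrary smoothing parameter.
sample = [340.0, 352.0, 355.0, 361.0, 378.0]
x95 = kde_percentile(0.95, sample, h=8.0)
```

Bisection is slower than Newton-type methods but cannot diverge here, since the estimated distribution function is continuous and strictly increasing.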

EXAMPLES 

The percentile estimation method presented here was
used to estimate selected percentiles for certain anthropometric
variables in the survey of USAF flying personnel
conducted by Clauser, Alexander, and Kennedy (1967). In
this survey, 185 variables were finally selected and recorded
for 2420 male pilots. Two of these anthropometric variables
will be considered here, namely, hip breadth and grip
strength.
First, let us consider hip breadth. The population
mean and standard deviation of all hip breadth measurements
are respectively computed to be 352 millimeters (mm)
and 19 mm. Based on a random sample of 500 hip breadth
measurements, the hypothesis of normality of this population
is accepted by the chi-square, Kolmogorov-Smirnov,
and Cramer-Von Mises goodness-of-fit tests at the 20 percent
level of significance. Thus, this population will be
considered to be "normally distributed."

Let us examine the performance of the percentile
estimation method presented here and compare it
with that of the gaussian percentile estimation method. Of
course, the gaussian method is expected to yield superior
results in this case, which may be thought of as a worst-case
analysis for the alternative percentile estimation
method presented here. The procedure was as follows:

Ten successive random samples each of size 50 were drawn
without replacement from the population of hip breadth
measurements. The 50, 55, 60, 65, 70, 75, 80, 85, 90, 95,
97.5, 99, and 99.5th percentile estimates were computed
for each sample by both the gaussian method [see Roebuck,
Kroemer, and Thomson (1975)] and equation (5). In order
to compare the performance of both methods, the corresponding
population percentiles were computed for all 2420
observations by means of the counting method described
earlier. These population percentiles were taken to be the
standard reference values. The average nonparametric percentile
estimates, gaussian estimates, and corresponding
population percentiles were then computed from the ten
samples and are presented in table 1. In addition, the
average squared error between each estimate and the corresponding
population percentile value was computed for both
methods and is also given in table 1. As observed, the
gaussian method is superior for estimating the percentiles of
the hip breadth population, particularly for the 70th and
larger percentiles. This is the result of utilizing the added
information of normality in the gaussian method when, in
fact, the population is indeed a "normally distributed" one.
Recall that this assumption is not made when using
equation (5).
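
The comparison procedure can be sketched as follows. This is an illustration under several assumptions: the population is a synthetic gaussian stand-in generated in the code (not the survey data), the population percentile is taken by a simple order-statistic count rather than by class intervals, the ten samples are drawn independently of one another, and all function names are hypothetical. Only the gaussian estimator is shown; any other estimator with the same signature could be passed in.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def population_percentile(population, alpha):
    """Smallest value with at least 100*alpha percent of observations at or below it."""
    ordered = sorted(population)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]


def gaussian_estimate(sample, alpha):
    """Sample mean plus standard normal quantile times sample standard deviation."""
    return mean(sample) + NormalDist().inv_cdf(alpha) * stdev(sample)


def evaluate(population, estimator, alpha, n_samples=10, size=50, seed=0):
    """Average estimate and average squared error over repeated samples."""
    rng = random.Random(seed)
    reference = population_percentile(population, alpha)
    # Each sample is drawn without replacement from the full population.
    estimates = [estimator(rng.sample(population, size), alpha)
                 for _ in range(n_samples)]
    avg_sq_error = mean([(e - reference) ** 2 for e in estimates])
    return mean(estimates), reference, avg_sq_error


# Synthetic stand-in population (NOT the survey data): 2420 gaussian values.
gen = random.Random(1)
synthetic_population = [gen.gauss(352.0, 19.0) for _ in range(2420)]
avg_estimate, reference, avg_sq_error = evaluate(
    synthetic_population, gaussian_estimate, 0.95)
```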


Now let us consider grip strength. The population
mean and standard deviation of all grip strength measurements
are computed to be 56 pounds and 7.6 pounds,
respectively. Based on a random sample of 500 grip strength
measurements, the hypothesis of normality of this population
is rejected by both the chi-square and Cramer-Von
Mises goodness-of-fit tests at the 10 percent level of
significance. Thus, unlike the hip breadth population, this
population is not normally distributed.

Let  us  now  examine  the  performance  of  the  percentile 
estimation  method  presented  here  and  again  compare  the 
performance with the gaussian method. The same manner
of  comparison  was  used  as  for  the  hip  breadth  population 
data.  Table  2 illustrates  the  results  of  the  comparison.  It 
is  observed  that  the  nonparametric  estimator  outperforms 
the  gaussian  estimator  when  estimating  the  97.5,  99,  and 
99.5th  percentiles. 


Table 1. Average Percentile Estimates and Average Squared Error Performance
for the Hip Breadth Population Data
(estimates and population percentiles in mm)

            Gaussian   Nonparametric  Population  Average Squared Error  Average Squared Error
Percentile  Estimate   Estimate       Percentile  (Gaussian Estimate)    (Nonparametric Estimate)

50.0        354.79     354.78         352         19.08                  19.03
55.0        357.15     357.17         354         22.11                  22.17
60.0        359.54     359.59         356         25.66                  25.94
65.0        362.01     362.19         359         23.38                  24.32
70.0        364.62     364.94         362         22.53                  24.70
75.0        367.43     368.01         365         23.22                  26.37
80.0        370.58     371.72         367         32.17                  42.77
85.0        374.22     376.19         371         32.31                  51.90
90.0        378.84     381.94         376         33.66                  67.13
95.0        385.65     390.15         385         32.31                  66.63
97.5        391.55     397.36         392         38.33                  75.86
99.0        398.42     406.15         402         59.12                  79.65
99.5        403.11     412.09         [missing]   76.38                  88.61


Table 2. Average Percentile Estimates and Average Squared Error Performance
for the Grip Strength Population Data
(estimates and population percentiles in pounds)

            Gaussian   Nonparametric  Population  Average Squared Error  Average Squared Error
Percentile  Estimate   Estimate       Percentile  (Gaussian Estimate)    (Nonparametric Estimate)

50.0        56.35      56.35          56          0.82                   0.82
55.0        57.28      57.28          57          0.90                   0.90
60.0        58.22      58.22          58          1.02                   1.02
65.0        59.19      59.19          59          1.19                   1.19
70.0        60.22      60.22          60          1.42                   1.42
75.0        61.32      61.42          61          1.76                   1.90
80.0        62.56      62.93          63          2.20                   2.14
85.0        63.99      64.70          64          2.47                   3.23
90.0        65.80      66.99          66          3.19                   4.99
95.0        68.48      70.60          70          6.64                   7.69
97.5        70.80      73.61          73          10.35                  9.21
99.0        73.50      76.90          76          13.36                  10.90
99.5        75.34      79.22          78          15.38                  13.31



CONCLUSIONS


In conclusion, based on the limited comparison just
described, it is conjectured that the nonparametric percentile
estimator will outperform the gaussian estimator for
nonnormal populations. Although not extensively investigated,
the degree of performance improvement appears to be
proportional to the degree of nonnormality. Future effort
needs to be directed toward an extensive Monte Carlo
simulation for further examination of the proposed
nonparametric percentile estimator.